Abstract: In this paper, the feasibility and efficiency of non-causal prediction in a P-frame are examined and, based on the findings, a new P-frame coding scheme is proposed. Motion-compensated inter-frame prediction, which has been widely used in low-bit-rate television coding, is an efficient method for reducing the temporal redundancy in a sequence of video signals. The proposed scheme therefore combines motion compensation with non-causal prediction based on an interpolative, rather than Markov, representation. However, energy dispersion occurs in the scheme because the interpolative prediction transform matrix is non-orthogonal. To solve this problem, we introduce a new conditional pel replenishment method. In addition, rotation scanning is applied because feedback quantization is used as the quantizer. Simulation results show that the proposed coding scheme achieves an improvement of approximately 0.3-2 dB at an entropy similar to that of the traditional hybrid coding method.

Index Terms: non-causal prediction, inter-frame coding, conditional pel replenishment.

I. Introduction 

Motion-compensated (MC) image coding, which takes advantage of frame-to-frame redundancy to achieve a high data compression rate, is one of the most popular inter-frame coding techniques [1]-[2]. For the H.26x family of video coding standards, a motion estimation (ME)/MC coding tool combined with an orthogonal transform (OT) [15]-[26], such as the discrete cosine transform (DCT), has been introduced. This tool now plays an important role in the inter-frame coding field [12]-[14], [27], [28]. By applying conditional pixel replenishment and quantization control in the DCT coefficient domain, the H.26x standards gain considerable coding efficiency and reduce the transmission bandwidth.

Separately, we have developed a hybrid I-frame encoding method based on non-causal interpolative prediction and differential feedback quantization that utilizes the intra-frame spatial correlation [3]. To verify the efficiency of that hybrid coding method, we compared it with H.264 for I-frames and found an improvement of approximately 0.5-5 dB [3].

In this paper, a new configuration for P-frame coding is 
presented. In designing this hybrid coding scheme, we show 
that orthogonal transforms do not need to be considered as 
constraints. 

II. Proposed Scheme

The proposed coding scheme is shown in Figure 1. In this model, MC predictive coding is performed first, and the residual signal is then encoded by an interpolative prediction (IP) method based on 8x8 blocks [3]. We term this hybrid coding method the "MC+IP synthesis configuration".

A. Motion-compensated Predictive Coding 

MC predictive coding in the proposed method is iden- 
tical to that used for inter-frame MC prediction of P-frames in 
the H.264 video coding standard. Here, the number of 
reference frames is set equal to one. 

B. Interpolative Prediction 

The residual signal after motion compensation is coded by an IP method based on 8x8 blocks. The encoding matrix C used in this IP is similar to that presented in Ref. 3, except for the elements that correspond to the four corner pixels of the block in the inter-block processing. In Ref. 3, we showed that non-causal interpolative prediction can be realized as a "transform coding", so the interpolation part can be considered a "substitute" for an OT such as the DCT.

The configuration of MC predictive coding with interpolative prediction is shown in Figure 2.

Figure 1. Proposed coding scheme

Short Paper
ACEEE Int. J. on Signal & Image Processing, Vol. 4, No. 1, Jan 2013

Figure 2. MC predictive coding with interpolative prediction (diagram labels: N frame, MC+IP, N+1 frame, motion compensation)

In our configuration, the prediction error Y_n can be expressed as (1).

$$Y_n = C_I X_n - C_I f(X'_{n-1}) = C_I \left( X_n - f(X'_{n-1}) \right) \qquad (1)$$

Here, X_n is the input 8x8 block signal plus four corner pixels, arranged as a 68x1 vector by raster-order scanning; X'_{n-1} is the reference vector at exactly the same position in the last reconstructed frame; f(·) denotes the MC function, so f(X'_{n-1}) is the vector X'_{n-1} after MC processing; and C_I is the 68x68 predictive matrix, which can be expressed as follows:

A1, A2 and A3 are 8x8 matrices given in (3)-(5). B is a 4x64 matrix whose transpose is shown in (6); its elements equal -1 only at positions (1,1), (2,8), (3,57) and (4,64) and 0 at all other positions. O stands for a zero matrix.
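The structure of B and the linearity used in (1) can be illustrated with a short sketch. It assumes raster-order indexing of the 8x8 block, under which the four non-zero columns of B fall on the block corners; the function names and the random stand-in for the predictive matrix are hypothetical and are not taken from the paper.

```python
import numpy as np

def build_corner_matrix():
    """Build the 4x64 matrix B: -1 at (1,1), (2,8), (3,57), (4,64) (1-based), 0 elsewhere."""
    B = np.zeros((4, 64))
    for row, col in [(1, 1), (2, 8), (3, 57), (4, 64)]:
        B[row - 1, col - 1] = -1.0
    return B

def corner_positions(B):
    """Map each -1 column of B to a (row, col) position in the 8x8 block.

    Assumes the 64 block pixels are indexed in raster order (0-based output).
    """
    cols = np.argmax(B == -1.0, axis=1)
    return [(int(c) // 8, int(c) % 8) for c in cols]

def prediction_error(C, x, x_ref_mc):
    """Right-hand side of (1): apply the predictive matrix to the MC residual."""
    return C @ (x - x_ref_mc)

if __name__ == "__main__":
    print(corner_positions(build_corner_matrix()))
    rng = np.random.default_rng(0)
    C = rng.standard_normal((68, 68))   # stand-in only, NOT the paper's matrix
    x, r = rng.standard_normal(68), rng.standard_normal(68)
    # Both sides of (1) agree because the prediction is linear.
    print(np.allclose(C @ x - C @ r, prediction_error(C, x, r)))
```

Under the raster-order assumption, columns 1, 8, 57 and 64 correspond to the four corners of the 8x8 block, which is consistent with B coupling the block to its four corner pixels.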

C. Optimal Quantization Scheme 

The difference signals output by the interpolative process, which correspond to the IP errors of the MC residuals, are sequentially input to the feedback quantizer [4]. In this way, the coding errors that result from power expansion in the inter-block processing, which is caused by the non-orthogonal system, can be suppressed [4]-[5].
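The power expansion mentioned above can be made concrete with a small numerical sketch. It compares an orthonormal 8-point DCT with a toy one-dimensional interpolative prediction matrix (each sample minus half the sum of its two neighbours). The toy matrix and its coefficient 0.5 are assumptions chosen for illustration; they are not the matrix C_I of the paper, and the sketch does not implement the feedback quantizer of Ref. 4.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    D[0, :] /= np.sqrt(2.0)
    return D

def toy_ip_matrix(n=8, a=0.5):
    """Hypothetical 1-D interpolative predictor: x[i] - a * (x[i-1] + x[i+1])."""
    return np.eye(n) - a * (np.eye(n, k=1) + np.eye(n, k=-1))

def error_power_gain(T):
    """Mean output power of T^{-1} e for white, unit-variance quantization noise e.

    The decoder applies T^{-1}, so the reconstruction error is T^{-1} e and
    its expected power per sample is trace(T^{-T} T^{-1}) / n.
    """
    T_inv = np.linalg.inv(T)
    return float(np.trace(T_inv.T @ T_inv) / T.shape[0])

if __name__ == "__main__":
    print("DCT gain:", error_power_gain(dct_matrix()))
    print("toy IP gain:", error_power_gain(toy_ip_matrix()))
```

For the orthonormal transform the gain is 1, so the quantization error power is preserved on decoding. For the non-orthogonal toy matrix the gain is larger than 1, which is the kind of expansion that motivates a quantizer with feedback.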

D. Conditional Pel Replenishment

Figure 3. Conditional pel replenishment

As already mentioned, because an OT is not employed in 
our method, energy is not concentrated, but is distributed 
throughout an entire block. Consequently, determining 
whether pixels should be quantized is an issue. For this reason, 
we introduced conditional pel replenishment to the scheme. 

As shown in Figure 3, the data of a previously decoded frame, which are used for motion compensation, can also be used as the reference for conditional pixel replenishment control before the current pixel data are quantized. As a result of pel replenishment, the transmission bandwidth is reduced.

Specifically, the decoded data of the previous frame are 
also processed by IP based on 8 x 8 blocks to obtain a set of 
reference values; and these values are compared with the 
predefined pixel threshold (PTH). As the reference values are 
obtained from the decoded data, this conditional pixel 
replenishment can be realized without additional overhead 
information. 

In addition, the differences between the IP outputs of the current frame and those of the previous frame (the reference values) are compared with the 4x4 sub-block threshold (BTH), which is obtained by preliminary experiments. In this way, we determine whether the pixels in each 4x4 sub-block of the current 8x8 block should be quantized. Because this decision depends not only on the decoded data and the motion vector information, which the decoder already has, but also on the current 4x4 sub-block data, which have not yet been transferred to the decoder, it is necessary to add information for each 4x4 sub-block that indicates whether it is quantized. As a result, this conditional pixel replenishment can be realized with 1 bit (ON/OFF) of additional overhead information for every 16 pixels (4x4 sub-block).
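The sub-block decision described above might be sketched as follows. The text does not state how the differences within a 4x4 sub-block are aggregated before comparison with BTH, so the mean absolute difference used here is an assumption, and the function names are hypothetical.

```python
import numpy as np

def subblock_flags(ip_current, ip_reference, bth):
    """Return a 2x2 array of ON/OFF flags, one per 4x4 sub-block of an 8x8 block.

    ip_current / ip_reference: 8x8 IP outputs of the current and previous frame.
    Assumption: a sub-block is significant (flag 1) when its mean absolute
    difference exceeds the sub-block threshold BTH.
    """
    diff = np.abs(np.asarray(ip_current, dtype=float) - np.asarray(ip_reference, dtype=float))
    # Axes after reshape: (sub-block row, row in sub-block, sub-block col, col in sub-block).
    means = diff.reshape(2, 4, 2, 4).mean(axis=(1, 3))
    return (means > bth).astype(np.uint8)

def overhead_bits_per_pixel(flags):
    """One flag bit per 4x4 sub-block, spread over the 64 pixels of the block."""
    return flags.size / 64.0

if __name__ == "__main__":
    cur = np.zeros((8, 8))
    cur[4:, 4:] = 10.0                      # only the bottom-right sub-block changes
    flags = subblock_flags(cur, np.zeros((8, 8)), bth=1.0)
    print(flags)
    print(overhead_bits_per_pixel(flags))   # 4 bits / 64 pixels
```

With four flags per 8x8 block, the side information amounts to 1 bit per 16 pixels, which matches the overhead stated in the text.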

A threshold replenishment algorithm for adaptive vector quantization was first proposed by Fowler [6] in 1998 and has subsequently been used in various coding technologies [7]-[10]. The conditional pixel replenishment used in our method is based on that algorithm but uses no codebook, because quantization is performed not on vectors but on each individual pixel in an 8x8 block. Furthermore, the reference values are the outputs of the IP, not the distortion measure between the code-vector and the quantization input vector [6]. The distortion measure therefore does not have to be computed for each block, which means that the computational time of the proposed method is less than that of Ref. 6. However, the threshold values here are predefined and must be changed for each frame of a video sequence. Improving the threshold selection process is left as future work.

E. Rotation Scanning 

In this paper, to realize the replenishment of pixels in the spatial domain, we propose a new approach to improve the image quality: the input 8x8 block signal is reordered to suit the sub-block system.

In a feedback quantization system, the power of the coding error can be expressed as (7).

Here, f_i is the feedback coefficient for one 8x8 block and $Q_p^2$ is the power of the quantization error. When conditional pixel replenishment is performed on 4x4 sub-blocks, a non-significant sub-block is not quantized; as a result, the power of the quantization error in (7) is replaced by the power of the prediction error. Generally, the power of the quantization error is smaller than that of the prediction error. Therefore, if we can reduce the value of f_i at the positions of the non-significant sub-blocks, we can suppress the increase in coding error power given by (7). According to Ref. 4, the value of f_i is determined by three matrices: the predictive matrix C_I, the scanning order matrix P, and the transform matrix D. Because no OT is applied in our system, matrix D is already fixed; the predictive matrix C_I, which defines the prediction error, is also fixed. The scanning order P is therefore the only remaining degree of freedom.

When the scanning order is 4x4x4, as shown in Figure 4, the f_i values of the 8x8 block are as shown in Figure 5.

Figure 4. 4x4x4 scanning order

The sum of f_i over each sub-block is as follows:

$$\begin{aligned}
\#1:\ \sum f_i &= 19.4527 \\
\#2:\ \sum f_i &= 26.2304 \\
\#3:\ \sum f_i &= 26.8412 \\
\#4:\ \sum f_i &= 40.1022
\end{aligned} \qquad (8)$$



Therefore, once it has been decided whether each 4x4 sub-block is quantized, the input signal must be reordered. For example, suppose that sub-block #4 does not need to be quantized. Its sum of f_i is the largest of all sub-blocks, so, according to the definition of distortion in the feedback quantization system given in (7), the input signal is reordered so that the non-significant sub-block is scanned as early as possible. Table I shows this processing.



Figure 5. f_i values of the 8x8 block (8x8 grid of values; legible entries range from 0.800 to 8.163)
Table I. Rotation Processing For All Situations

Table I shows how the rotation processing is assigned according to the pattern of insignificant and significant sub-blocks (expressed as 0 and 1 at the positions of the four 4x4 sub-blocks). L means a left (counterclockwise) rotation, R means a right (clockwise) rotation, R^2 means two right rotations, and "none" means no rotation. There are two special cases in this table, marked with a gray background: 0110 -> 0111 and 1001 -> 0111. In these cases, after the rotation operation, the last sub-block, #4, is forcibly set to 1 on the basis of preliminary experimental results.
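The idea behind the rotation can be sketched as a simple search over the four possible 90-degree rotations. This is not a reproduction of Table I: the selection rule below (minimize the total sub-block weight from (8) at the positions of non-significant sub-blocks), the 2x2 layout of positions #1 to #4, and the function names are all assumptions, and the two special cases of Table I are not modelled.

```python
import numpy as np

# Sum of f_i per sub-block position, taken from (8). The layout
# (#1 top-left, #2 top-right, #3 bottom-left, #4 bottom-right) is an assumption.
SUBBLOCK_WEIGHT = np.array([[19.4527, 26.2304],
                            [26.8412, 40.1022]])

def rotation_cost(flags, k):
    """Total weight at non-significant (0) positions after k left rotations."""
    rotated = np.rot90(np.asarray(flags), k)
    return float(SUBBLOCK_WEIGHT[rotated == 0].sum())

def best_rotation(flags):
    """Number of counterclockwise 90-degree rotations with the lowest cost.

    Ties are resolved in favour of the smallest rotation count, so a block
    with no non-significant sub-block is left unrotated.
    """
    return min(range(4), key=lambda k: rotation_cost(flags, k))

def rotate_block(block, k):
    """Rotate an 8x8 block; its 4x4 quadrants move like the 2x2 flag grid."""
    return np.rot90(np.asarray(block), k)

if __name__ == "__main__":
    flags = [[1, 1], [1, 0]]          # only sub-block #4 is non-significant
    k = best_rotation(flags)
    print(k, np.rot90(np.asarray(flags), k).tolist())
```

In this sketch a block whose only non-significant sub-block sits at position #4 is rotated twice, which moves that sub-block to position #1, where the weight from (8) is smallest.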

The improvement in coding efficiency obtained by this rotation operation is shown in Table II.

F. Features of Proposed Scheme

The proposed inter-frame coding scheme has three char- 
acteristic features: 

• An OT is not used. 

• Conditional pel replenishment is performed without 
additional overhead information. 

• A new hybrid coding framework, MC+IP combined with 
feedback quantization, is employed. 

Table II. Coding Efficiency Improvement By The Rotation Operation

An OT is not used in our coding scheme; instead, an IP method is adopted. An OT does not utilize the relations between pixels but merely transforms the signal from the spatial domain to the frequency domain. In contrast, an IP method can compress the signal by eliminating the correlation between pixels within the frame. Generally, the MC prediction error is independent of time; however, a spatial correlation still exists. Accordingly, we have replaced the OT with an IP in our method, because in Ref. 3 we showed that IP can be realized as a "transform coding".

On the other hand, the non-orthogonality of the IP transform matrix means that the power expansion problem, in which coding errors are expanded when decoded, exists in the proposed method. As a result, feedback quantization is necessary.

III. Simulation 

We now present simulation results obtained with the proposed P-frame coding scheme and compare our method with the H.264 baseline [29]. Because the decoded I-frame is used as the first reference frame for motion compensation, the first frame of the test sequence is coded by the H.264 I-frame baseline under the same parameter values for both methods, which eliminates the influence of the I-frame.

The first seven frames of two CIF (352 x 288) test video sequences, foreman and bus, obtained from the YUV Video Sequences website [11], were used. The MC coding parameters, which are the same for both methods, are set as follows:

• Search range: 32 pixels. 

• Total number of reference frames: 1.

• ME scheme: fast full search.

• PSliceSearch8x8: 1 (used; all other types are not used).

• DisableIntraInInter: 1 (disable intra mode for inter slices).

• Rate-distortion-optimized mode decision: used. 
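For orientation, the motion search configured by the parameters above can be approximated by an exhaustive block-matching sketch on 8x8 blocks with one reference frame. This is a plain sum-of-absolute-differences full search, not the fast full search or the rate-distortion-optimized mode decision of the H.264 reference encoder; the function name and the SAD criterion are assumptions made for illustration.

```python
import numpy as np

def full_search(cur, ref, top, left, block=8, search=32):
    """Find the motion vector (dy, dx) of one block by exhaustive SAD search.

    cur, ref: 2-D luma arrays of the current and reference frames.
    (top, left): position of the block in the current frame.
    Candidates that fall outside the reference frame are skipped.
    """
    target = cur[top:top + block, left:left + block].astype(np.int64)
    height, width = ref.shape
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.integers(0, 256, size=(64, 64))
    cur = np.roll(ref, shift=(2, -3), axis=(0, 1))   # synthetic global motion
    print(full_search(cur, ref, top=24, left=24, search=4))
```

The MC residual that is passed on to the IP stage is then the difference between the current block and the reference block displaced by the returned vector.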

Besides these parameters, the threshold values of the 
proposed scheme are adapted according to the input frames. 

A. Comparison of Prediction Errors 

Table III. Statistic Values Of Residual Signals

A comparison of the prediction errors is shown first. Through this comparison, we can see the distribution of the errors obtained by the proposed scheme and whether spatial correlation exists in the signal after motion compensation. Table III lists several statistical values for both methods that reflect the distribution of their residual signals.

In this table, "PM" stands for "proposed method"; "Entropy" is calculated based on Shannon theory; and "Average Error" is the average value of two signal powers. "Number of 0's" is the number of pixels that are predicted exactly.
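Statistics of this kind can be computed from a residual image as in the following sketch. The entropy is the first-order Shannon entropy of the residual values in bits per pixel. The text does not define "Average Error" precisely, so the mean squared value returned here as `power` is an assumption, as are the function names.

```python
import numpy as np

def shannon_entropy(residual):
    """First-order Shannon entropy (bits per sample) of the residual values."""
    _, counts = np.unique(np.asarray(residual).ravel(), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def residual_stats(residual):
    """Entropy, mean signal power and number of exactly predicted pixels."""
    r = np.asarray(residual, dtype=np.float64)
    return {
        "entropy": shannon_entropy(r),
        "power": float(np.mean(r ** 2)),            # assumed definition
        "zeros": int(np.count_nonzero(r == 0)),
    }

if __name__ == "__main__":
    print(residual_stats([[0, 0], [1, -1]]))
```

A residual that is concentrated around zero gives a lower entropy and a larger zero count, which is how the values in Table III reflect the shape of the error distribution.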

Figure 6 shows the prediction errors of the first P-frame at each pixel position when the first frame of foreman.cif is used as a test image. Here, the x-axis shows the pixel position and the y-axis shows the value of the prediction error.

The distribution of the prediction errors for the proposed 
method is clearly more concentrated around zero than that 
for H.264, and the residual signal power is about 37.6% lower 
under the proposed method. Therefore, we consider that our 
scheme provides improved coding efficiency if an appropriate 
quantization method is employed. 

B. Comparison of Coding Efficiency 

Next, we show the coding results of the proposed method and H.264. As stated before, the threshold values in our method must be changed for each frame, and these thresholds are obtained by a preliminary experiment. Since PTH and BTH are typically chosen within a certain range, the results presented here are also limited. For this reason, the conditional pel replenishment used here has potential for improvement.

Figure 6. Comparison of prediction errors

Figure 7. Comparison of coding efficiency for foreman.cif

Figures 7-10 compare the coding efficiency of the two methods for the test sequences foreman, bus, flower and highway, respectively. Here, nh and nv denote the correlation of the test image in the horizontal and vertical directions, respectively.

In these plots, the horizontal axis shows the frame number of the coded frames (0-20 for foreman.cif and 0-10 for the other test sequences), and the vertical axes show the entropy (bit/pixel; upper plot) and the peak signal-to-noise ratio (PSNR; dB; lower plot) of each test image. Although the number of bits required by the proposed method (its entropy) is approximately equal to that of H.264, the PSNR of the proposed method is consistently higher (the average improvement is about 0.3-2 dB).
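The PSNR values plotted in these figures follow the standard definition, which can be written as a short sketch. An 8-bit peak value of 255 is assumed, and the function name is hypothetical.

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal size."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(decoded, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")            # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

if __name__ == "__main__":
    ref = np.zeros((4, 4))
    print(psnr(ref, ref + 1.0))        # mean squared error of 1
```

Because PSNR is logarithmic in the mean squared error, a gain of 1 dB corresponds to a reduction of the mean squared error by roughly 21%.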
Figure 10. Comparison of coding efficiency for highway.cif

Table IV shows the coding results when the proposed 
method is applied to the bus CIF video sequence. Here, Q(.) 
is the average entropy of all pixels in the frame; H(.) expresses 
the overhead (motion vectors) entropy. 

Table IV. Coding Results Of Proposed Method

Table V. Entropy Of Motion Vectors

C. Comparison of Motion Vectors

Table V lists the entropy of the motion vectors in each frame for the two methods. Figure 11 shows the decoded frames obtained by the two methods for four test sequences, where lines denote the motion vectors. The proposed method clearly contains a smaller amount of motion-vector information than H.264. The corresponding decoded frames without motion vectors are shown in Figure 12.

D. Effect of Adaptive coding between intra/inter mode 

Figure 13 shows the simulation results of adaptive coding between the inter and intra coding modes. Insertion of the PinP (Picture in Picture) configuration, shown in Figure 14, starts at the first frame and is completed at the eighth frame. In Figure 13, the PSNR of "inter mode only" is the PSNR obtained when all frames from the second one onward are forced to be coded in inter-frame mode, with an amount of transmitted bits almost the same as that of the proposed adaptive scheme.

©2013 ACEEE
DOI: 01.IJSIP.4.1.1137

Figure 13. PSNR (dB) versus frame number for adaptive coding of H.264, adaptive coding of PM, and inter mode only

Figure 14. PinP Configuration

If the coding mode is fixed to inter coding, as shown here, the coding efficiency begins to improve at the second P-frame and, after two frames have been coded, eventually approaches that of the adaptive system. The arrow in this figure indicates how much improvement is achieved by the proposed adaptive scheme.

On the other hand, because an adaptive intra/inter mode selection method with overhead information is applied in the H.264 standard, no significant decrease in PSNR occurs when the PinP test sequence shown in Figure 14 is used.

In addition, in the H.264 scheme, because the intra-frame coding mode can be selected at any time throughout the entire coding period, the overhead for every macro-block is about 0.01 bit/pixel more than that of the proposed method. However, the disadvantage of the proposed method is that its adaptive mode selection is not flexible enough. We will address this point in future work.

Conclusions 

In this paper, we proposed a new inter-frame coding scheme in which the OT (e.g., a DCT) used in conventional hybrid coding schemes is replaced by a non-causal IP. Application of this IP can potentially reduce the amount of signal power a priori. Since IPs make use of the spatial correlations between pixels, we consider them more effective than OTs (which merely perform domain transforms) for residual data that have been compressed by MC inter-frame prediction. However, combining our method with an OT is also of great research interest, as shown in Section II of this paper. As a result, we have also introduced conditional pel replenishment to our scheme. Moreover, no additional overhead information is added by employing this method. Our model thus has the three characteristic features shown in Section II-F. A comparison between the simulation results of the proposed method and H.264 in Section III showed that, for four test sequences, the proposed scheme achieves an improvement of approximately 0.3-2 dB at an entropy similar to that of the H.264 baseline level.

Figure 11. Comparison of motion vectors for the decoded frame obtained by (left) H.264 and (right) the proposed scheme

Figure 12. Comparison of decoded frame obtained by (left) H.264 and (right) the proposed scheme

As future work, the conditional pel replenishment method utilized in our scheme must be improved first. Another question to be explored is whether the proposed scheme can maintain high coding efficiency when the test sequence becomes large.

In conclusion, we have introduced a different approach 
for P-frame hybrid coding that utilizes the spatial correlation 
of the MC residual signal. Since our hybrid video coding 
method achieved high coding efficiency without employing 
an OT, we have shown the feasibility of non-orthogonal 
transforms for effective coding. 